Cost models for distance joins queries using R-trees
Abstract
The K-Closest-Pairs Query (K-CPQ), a type of distance join in spatial databases, discovers the K pairs of objects, formed from two different datasets, that have the K smallest distances. Recently, branch-and-bound algorithms based on R-trees have been developed to answer K-CPQs efficiently. For query optimization, analytical models are needed to estimate the processing cost of a specific query so that alternative execution plans can be evaluated. In this paper, we combine techniques that have been used for the analysis of nearest neighbor and spatial join queries, and derive the performance cost (in terms of disk accesses) of K-CPQs using R-trees. Moreover, we present two extensions of the cost model for K-CPQs: one exploiting buffer management with R-trees, and another for a second type of distance join, the so-called buffer queries. The proposed cost models are verified under a variety of distributions in 2-dimensional space on both synthetic and real datasets, and are shown to estimate the measured experimental results accurately. © 2005 Elsevier B.V. All rights reserved.
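To make the query definition in the abstract concrete, here is a minimal brute-force sketch of a K-CPQ over two small point sets. It is not the R-tree branch-and-bound algorithm the paper analyzes, and it says nothing about disk accesses; the function name `k_closest_pairs`, the sample points, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import heapq
import math
from itertools import product


def k_closest_pairs(p, q, k):
    """Return the k (distance, a, b) triples with the smallest distances,
    where a comes from p and b comes from q.

    Hypothetical helper: it enumerates every pair, so it costs
    O(|p| * |q|) distance computations and uses no index at all.
    """
    pairs = ((math.dist(a, b), a, b) for a, b in product(p, q))
    return heapq.nsmallest(k, pairs)


# Two illustrative 2-dimensional datasets (made-up coordinates).
P = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
Q = [(1.0, 0.0), (5.0, 7.0), (8.0, 1.0)]

result = k_closest_pairs(P, Q, 3)
for dist, a, b in result:
    print(f"{a} - {b}: {dist:.3f}")
```

An index-based algorithm aims to return the same answer while pruning most of the pairs that this exhaustive scan examines, which is why its cost is measured in disk accesses rather than in pairs compared.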
Related resources
KRDB Research Centre Technical Report: Value Joins are Expensive over (Probabilistic) XML. Extended
We address the cost of adding value joins to tree-pattern queries and monadic second-order queries over trees in terms of the tractability of query evaluation over two data models: XML and probabilistic XML. Our results show that the data complexity rises from linear, for join-free queries, to intractable, for queries with value joins, while combined complexity remains essentially the same. For ...
Value Joins are Expensive over (Probabilistic) XML. Extended Version
We address the cost of adding value joins to tree-pattern queries and monadic second-order queries over trees in terms of the tractability of query evaluation over two data models: XML and probabilistic XML. Our results show that the data complexity rises from linear, for join-free queries, to intractable, for queries with value joins, while combined complexity remains essentially the same. For ...
A Cost Model for Estimating the Performance of Spatial Joins Using R-trees
The development of a cost model for predicting the performance of spatial joins has been identified in the literature as an important and difficult problem. In this paper, we present the first cost model that can predict the performance of spatial joins using R-trees. Based on two existing R-trees (join targets), our model first estimates the number of expected I/Os for the join process by assu...
Spatial Queries in the Presence of Obstacles
Despite the existence of obstacles in many database applications, traditional spatial query processing utilizes the Euclidean distance metric assuming that points in space are directly reachable. In this paper, we study spatial queries in the presence of obstacles, where the obstructed distance between two points is defined as the length of the shortest path that connects them without crossing ...
Complex Spatial Query Processing
The user of a Geographical Information System is not limited to conventional spatial selections and joins, but may also pose more complicated and descriptive queries. In this paper we focus on the efficient processing and optimization of complex spatial queries that involve combinations of spatial selections and joins. Our contribution is manifold; we first provide formulae that accurately esti...
Journal: Data Knowl. Eng.
Volume: 57, Issue:
Pages: -
Publication date: 2006